JPEG files are predictable in their structure, with each segment of
information (whether metadata in a header or the image itself)
delimited by well-known hex values called “markers.” The general
structure is as follows:
JPEG Image
| Byte Value | Description |
|---|---|
| 0xFF 0xD8 | Start of Image, the first bytes of a JPEG file. |
| 0xFF [Segment ID] | A marker indicating a new segment. Each type of segment has a unique ID. |
| 0xFF 0xD9 | End of Image, the last bytes in a file. |
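As a rough illustration of this layout, the sketch below walks the segments of a JPEG held in memory. The function name `iter_segments` is made up for this example, and the code assumes that every segment before the image data carries a two-byte big-endian size right after its marker, with no fill bytes between segments; real files can be messier.

```python
import struct


def iter_segments(data: bytes):
    """Yield (segment_id, payload) for each segment up to Start of Scan.

    Hypothetical helper: assumes each segment is a marker followed by a
    two-byte big-endian size, with no padding between segments.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing Start of Image marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        segment_id = data[pos + 1]
        if segment_id == 0xD9:  # End of Image: nothing left to read
            return
        # The size counts its own two bytes but not the two marker bytes.
        (size,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield segment_id, data[pos + 4:pos + 2 + size]
        if segment_id == 0xDA:  # Start of Scan: compressed image data follows
            return
        pos += 2 + size
```

Filtering the yielded pairs by segment ID is then enough to pick out any one segment's payload.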
Here are some additional JPEG segment markers:
| Byte Value | Name | Description |
|---|---|---|
| 0xFF 0xE0 | APP0 | Application marker (the JFIF header; present in most JPEGs, though Exif files may use APP1 instead) |
| 0xFF 0xDB | DQT | Quantization Table |
| 0xFF 0xC0 | SOF0 | Start of Frame |
| 0xFF 0xC4 | DHT | Define Huffman Table |
| 0xFF 0xDA | SOS | Start of Scan |
| 0xFF 0xED | APP13 | Photoshop storage (the one we need) |
The information we’re interested in is stored in the segment marked 0xFF 0xED. This is APP13 (APP14 would be 0xFF 0xEE), which Photoshop uses for its own storage. The APP13 segment has the following structure:
APP13 Segment
| Byte Value | Description |
|---|---|
| 0xFF 0xED | Start of the APP13 segment |
| 2 bytes | The segment size (big-endian), excluding the marker but including these two bytes |
| Photoshop 3.0\x00 | A fixed string |
8BIM segments are the individual fields in the APP13 segment. An 8BIM segment in turn has the following structure:
| Byte Value | Description |
|---|---|
| 8BIM | A four-byte segment marker (literally the ASCII string "8BIM") |
| Segment type | Two bytes indicating the segment type |
| Zero padding | Four bytes, normally all 0 (strictly, a two-byte empty name followed by the upper half of a four-byte size) |
| Segment size | Two bytes (big-endian) counting only the segment data, not the marker, type, padding, or the size itself |
| Segment data | The actual data of the 8BIM segment |
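A minimal sketch of reading these fields follows, given the payload of the 0xFF 0xED segment (everything after its two size bytes). The names `iter_8bim` and `PHOTOSHOP_HEADER` are invented for this example. One assumption goes beyond the table: the six bytes between the type and the data are read as a length-prefixed name padded to an even length, followed by a four-byte size. With an empty name and data under 64 KiB, that is exactly the table's four zero bytes plus a two-byte size. The sketch also assumes the data is padded to an even length.

```python
import struct

PHOTOSHOP_HEADER = b"Photoshop 3.0\x00"  # the fixed string from the table above


def iter_8bim(payload: bytes):
    """Yield (segment_type, data) for each 8BIM segment in the payload.

    Hypothetical helper; `payload` is the 0xFF 0xED segment without its
    marker and size bytes.
    """
    if not payload.startswith(PHOTOSHOP_HEADER):
        raise ValueError("payload does not start with the Photoshop 3.0 string")
    pos = len(PHOTOSHOP_HEADER)
    while pos + 12 <= len(payload):  # 12 bytes is the smallest possible segment
        if payload[pos:pos + 4] != b"8BIM":
            raise ValueError(f"expected 8BIM at offset {pos}")
        (segment_type,) = struct.unpack(">H", payload[pos + 4:pos + 6])
        # Name field: one length byte plus the name, rounded up to an even size.
        name_len = payload[pos + 6]
        name_field = (1 + name_len + 1) & ~1
        size_at = pos + 6 + name_field
        (size,) = struct.unpack(">I", payload[size_at:size_at + 4])
        start = size_at + 4
        yield segment_type, payload[start:start + size]
        # Assumption: odd-sized data is followed by one padding byte.
        pos = start + size + (size & 1)
```

Reading the size as four bytes costs nothing in the common case and keeps the walk from going out of step if a segment happens to carry a name.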
Inside the 8BIM segment’s data are additional subsegments, laid out
as follows:
| Byte Value | Description |
|---|---|
| 0x1C 0x02 | Subsegment marker |
| Segment type | One byte indicating the type of subsegment |
| Segment size | Two bytes (big-endian), excluding the marker, type, and size |
| Segment data | The data |
The IPTC keyword itself is then stored in one of these subsegments, specifically one of type 0x19. An image may contain several keyword subsegments, since the standard allows more than one keyword per image.
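To make the subsegment layout concrete, here is a small sketch that pulls the keywords out of an 8BIM segment's data. The names `iter_subsegments` and `keywords` are hypothetical, and two things are assumed rather than taken from the description above: keyword text is decoded as UTF-8, and any subsegment whose second marker byte is not 0x02 is simply skipped.

```python
import struct

KEYWORD_TYPE = 0x19  # subsegment type that holds one keyword


def iter_subsegments(data: bytes):
    """Yield (subsegment_type, value) for each 0x1C 0x02 subsegment."""
    pos = 0
    while pos + 5 <= len(data):
        if data[pos] != 0x1C:
            raise ValueError(f"expected 0x1C at offset {pos}")
        second_byte = data[pos + 1]
        subsegment_type = data[pos + 2]
        # The two-byte big-endian size counts only the value that follows.
        (size,) = struct.unpack(">H", data[pos + 3:pos + 5])
        value = data[pos + 5:pos + 5 + size]
        if second_byte == 0x02:  # only the kind of subsegment described above
            yield subsegment_type, value
        pos += 5 + size


def keywords(data: bytes, encoding: str = "utf-8"):
    """Return every keyword in order; the default encoding is an assumption."""
    return [
        value.decode(encoding, errors="replace")
        for subsegment_type, value in iter_subsegments(data)
        if subsegment_type == KEYWORD_TYPE
    ]
```

Returning a list rather than a single string reflects the fact that each keyword lives in its own subsegment.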
A program that manipulates these keywords, then, must do the following: